Collocations in a Rule-Based MT System: A Case Study Evaluation of Their Translation Adequacy

نویسندگان

  • Eric Wehrli
  • Violeta Seretan
  • Luka Nerima
  • Lorenza Russo
چکیده

Collocations constitute a subclass of multi-word expressions that are particularly problematic for machine translation, due 1) to their omnipresence in texts, and 2) to their morpho-syntactic properties, allowing virtually unlimited variation and leading to long-distance dependencies. Since existing MT systems incorporate mostly local information, these are arguably ill-suited for handling those collocations whose items are not found in close proximity. In this article, we describe an integrated environment in which collocations (and possibly their translation equivalents) are first identified from text corpora and stored in the lexical database of a translation system, then they are employed by this system, which is capable of dealing with syntactic transformations as it is based on a deep linguistic approach. We compare the performance of our system (in terms of collocation translation adequacy) with that of two major MT systems, one statistical, and the other rule-based. Our results confirm that syntactic variation affects translation quality and show that a deep syntactic approach is more robust in this sense, especially for languages with freer word order (e.g., German) and richer morphology (e.g., Italian) than English.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Can Automatic Post-Editing Make MT More Meaningful?

Automatic post-editors (APEs) enable the re-use of black box machine translation (MT) systems for a variety of tasks where different aspects of translation are important. In this paper, we describe APEs that target adequacy errors, a critical problem for tasks such as cross-lingual question-answering, and compare different approaches for post-editing: a rule-based system and a feedback approach...

متن کامل

Towards the Automatic Acquisition of Lexical Selection Rules

This paper is a study of a certain type of collocations and implication and application to acquisition of lexical selection rules in transfer-approach MT systems. Collocations reveal the co-occurrence possibilities of linguistic units in one language, which often require lexical selection rules to enhance the natural flow and clarity of MT output. The study presents an automatic acquisition and...

متن کامل

A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...

متن کامل

Investigating the Relationship Between Iranian EFL Learners’ Use of Strategies in Collocating Words and Their Proficiency Level

This study investigated the relationship between Iranian EFL learners’ use of strategies in producing English collocations and their proficiency level. Participants were 115 undergraduate university students at 3 proficiency levels, that is, low, intermediate, and high, majoring in English language at the Faculty of Letters and Humanties at Shahid Chamran University of Ahvaz, Iran. Their select...

متن کامل

Evaluating MT output with entailment technology

Constant evaluation is vital to the progress of machine translation. However, human evaluation is costly, time-consuming, and difficult to do reliably. On the other hand, automatic measures of machine evaluation performance (such as BLEU, NIST, TER, and METEOR), while cheap and objective, have increasingly come under suspicion as to whether they are satisfactory measuring instruments. Recent wo...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2009